173 research outputs found

    Characteristic Length Scale of Electric Transport Properties of Genomes

    A tight-binding model and a novel statistical method are used to investigate the relation between the sequence-dependent electric transport properties and the sequences of protein-coding regions of complete genomes. A correlation parameter $\Omega$ is defined to analyze the relation. For some particular propagation length $w_{max}$, the transport behaviors of the coding and non-coding sequences are very different and the correlation reaches its maximal value $\Omega_{max}$. $w_{max}$ and $\Omega_{max}$ are characteristic values for each species. A possible reason for the difference between the transport properties of the coding and non-coding regions is the mechanism of DNA damage repair together with natural selection. Comment: 4 pages, 4 figures

    Statistical analysis of the DNA sequence of human chromosome 22

    We study statistical patterns in the DNA sequence of human chromosome 22, the first completely sequenced human chromosome. We find that (i) the $33.4 \times 10^6$ nucleotide long human chromosome exhibits long-range power-law correlations over more than four orders of magnitude, (ii) the entropies $H_n$ of the frequency distribution of oligonucleotides of length $n$ ($n$-mers) grow sublinearly with increasing $n$, indicating the presence of higher-order correlations for all of the studied lengths $1 \le n \le 10$, and (iii) the generalized entropies $H_n(q)$ of $n$-mers decrease monotonically with increasing $q$ and the decay of $H_n(q)$ with $q$ becomes steeper with increasing $n \le 10$, indicating that the frequency distribution of oligonucleotides becomes increasingly nonuniform as the length $n$ increases. We investigate to what degree known biological features may explain the observed statistical patterns. We find that (iv) the presence of interspersed repeats may cause the sublinear increase of $H_n$ with $n$, and that (v) the presence of monomeric tandem repeats as well as the suppression of CG dinucleotides may cause the observed decay of $H_n(q)$ with $q$.
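
The entropies of n-mer frequency distributions described in this abstract can be illustrated with a minimal sketch. It assumes overlapping n-mers, plug-in (relative-frequency) probabilities, and the Rényi form for the generalized entropy $H_n(q)$; the abstract does not state which generalized entropy or counting scheme the authors used, and the function names and toy sequence are hypothetical.

```python
import math
from collections import Counter


def nmer_probabilities(seq, n):
    """Relative frequencies of the overlapping n-mers in seq."""
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    total = sum(counts.values())
    return [c / total for c in counts.values()]


def shannon_block_entropy(seq, n):
    """Plug-in Shannon entropy H_n in bits."""
    return -sum(p * math.log2(p) for p in nmer_probabilities(seq, n))


def renyi_block_entropy(seq, n, q):
    """Renyi entropy H_n(q) in bits (assumed form; q = 1 is the Shannon limit)."""
    if q == 1:
        return shannon_block_entropy(seq, n)
    probs = nmer_probabilities(seq, n)
    return math.log2(sum(p ** q for p in probs)) / (1 - q)


if __name__ == "__main__":
    toy = "ACGT" * 10 + "AAAAAAAAAA"  # toy sequence, not real chromosome data
    for n in (1, 2, 3):
        print(n, shannon_block_entropy(toy, n), renyi_block_entropy(toy, n, 2))
```

For a sequence without correlations, $H_n$ would grow roughly as $n H_1$; growth slower than that is the sublinear behavior the abstract refers to.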

    Single-photon single ionization of W$^{+}$ ions: experiment and theory

    Experimental and theoretical results are reported for photoionization of Ta-like (W$^{+}$) tungsten ions. Absolute cross sections were measured in the energy range 16 to 245 eV employing the photon-ion merged-beam setup at the Advanced Light Source in Berkeley. Detailed photon-energy scans at 100 meV bandwidth were performed in the 16 to 108 eV range. In addition, the cross section was scanned at 50 meV resolution in regions where fine resonance structures could be observed. Theoretical results were obtained from a Dirac-Coulomb R-matrix approach. Photoionization cross section calculations were performed for singly ionized atomic tungsten ions in their $5s^2 5p^6 5d^4({^5}D)6s \; {^6}{\rm D}_{J}$, $J=1/2$, ground level and the associated excited metastable levels with $J=3/2$, 5/2, 7/2 and 9/2. Since the ion beams used in the experiments must be expected to contain long-lived excited states also from excited configurations, additional cross-section calculations were performed for the second-lowest term, $5d^5 \; {^6}{\rm S}_{J}$, $J=5/2$, and for the $^4$F term, $5d^3 6s^2 \; {^4}{\rm F}_{J}$, with $J=3/2$, 5/2, 7/2 and 9/2. Given the complexity of the electronic structure of W$^{+}$, the calculations reproduce the main features of the experimental cross section quite well. Comment: 23 pages, 7 figures, 1 table. Accepted for publication in J. Phys. B: At. Mol. & Opt. Phys.

    Finite-sample frequency distributions originating from an equiprobability distribution

    Given an equidistribution of probabilities $p(i)=1/N$, $i=1..N$, what is the expected corresponding rank-ordered frequency distribution $f(i)$, $i=1..N$, if an ensemble of $M$ events is drawn? Comment: 4 pages, 4 figures
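
The question posed in this abstract can be explored numerically. The sketch below is a plain Monte Carlo estimate, not the paper's analytical result: it draws $M$ events uniformly from $N$ states, sorts the observed relative frequencies by rank, and averages over many trials. The function name, parameters, and trial count are assumptions made for illustration.

```python
import random
from collections import Counter


def expected_rank_frequencies(n_states, n_events, n_trials=2000, seed=0):
    """Monte Carlo estimate of the mean rank-ordered relative frequencies."""
    rng = random.Random(seed)
    totals = [0.0] * n_states
    for _ in range(n_trials):
        counts = Counter(rng.randrange(n_states) for _ in range(n_events))
        ranked = sorted(counts.values(), reverse=True)
        ranked += [0] * (n_states - len(ranked))  # states that never occurred
        for rank, count in enumerate(ranked):
            totals[rank] += count / n_events
    return [t / n_trials for t in totals]


if __name__ == "__main__":
    for rank, f in enumerate(expected_rank_frequencies(10, 50), start=1):
        print(rank, round(f, 4))
```

Even though every state has probability $1/N$, the top-ranked frequency in a finite sample lies above $1/N$ and the bottom-ranked one below it, which is the finite-sample effect the abstract asks about.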

    Bauwissen im Italien der Frühen Neuzeit (Construction Knowledge in Early Modern Italy)

    Entropy estimates of small data sets

    Estimating entropies from limited data series is known to be a non-trivial task. Naive estimations are plagued with both systematic (bias) and statistical errors. Here, we present a new 'balanced estimator' for entropy functionals (Shannon, Rényi and Tsallis) specially devised to provide a compromise between low bias and small statistical errors for short data series. This new estimator outperforms other currently available ones when the data sets are small and the probabilities of the possible outputs of the random variable are not close to zero. Otherwise, other well-known estimators remain a better choice. The potential range of applicability of this estimator is quite broad, especially for biological and digital data series. Comment: 11 pages, 2 figures
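
The bias of naive entropy estimates on short data series can be seen in a small simulation. The sketch below does not implement the paper's balanced estimator, whose formula is not given in the abstract. It compares the naive plug-in estimate with the well-known Miller-Madow correction, which adds $(K-1)/(2N)$ for $K$ observed symbols and $N$ samples. The function names and default parameters are assumptions.

```python
import math
import random
from collections import Counter


def naive_entropy(samples):
    """Plug-in (maximum-likelihood) Shannon entropy in nats."""
    n = len(samples)
    return -sum(c / n * math.log(c / n) for c in Counter(samples).values())


def miller_madow_entropy(samples):
    """Naive estimate plus the Miller-Madow correction (K - 1) / (2 N)."""
    observed = len(set(samples))  # K: number of distinct symbols seen
    return naive_entropy(samples) + (observed - 1) / (2 * len(samples))


def mean_estimates(n_states=8, n_samples=20, n_trials=2000, seed=0):
    """Average both estimators over many short uniform data series."""
    rng = random.Random(seed)
    naive_sum = mm_sum = 0.0
    for _ in range(n_trials):
        data = [rng.randrange(n_states) for _ in range(n_samples)]
        naive_sum += naive_entropy(data)
        mm_sum += miller_madow_entropy(data)
    return naive_sum / n_trials, mm_sum / n_trials


if __name__ == "__main__":
    naive_mean, mm_mean = mean_estimates()
    print("true:", math.log(8), "naive:", naive_mean, "Miller-Madow:", mm_mean)
```

The naive mean falls below the true value $\ln 8$, which is the systematic error the abstract describes; the corrected mean lands closer, at the cost of relying on the observed symbol count.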

    Local Renyi entropic profiles of DNA sequences

    Background: In a recent report the authors presented a new measure of continuous entropy for DNA sequences, which allows the estimation of their randomness level. The definition therein explored was based on the Rényi entropy of probability density estimation (pdf) using the Parzen's window method and applied to Chaos Game Representation/Universal Sequence Maps (CGR/USM). Subsequent work proposed a fractal pdf kernel as a more exact solution for the iterated map representation. This report extends the concepts of continuous entropy by defining DNA sequence entropic profiles using the new pdf estimations to refine the density estimation of motifs. Results: The new methodology enables two results. On the one hand it shows that the entropic profiles are directly related with the statistical significance of motifs, allowing the study of under and over-representation of segments. On the other hand, by spanning the parameters of the kernel function it is possible to extract important information about the scale of each conserved DNA region. The computational applications, developed in Matlab m-code, the corresponding binary executables and additional material and examples are made publicly available at http://kdbio.inesc-id.pt/~svinga/ep/. Conclusion: The ability to detect local conservation from a scale-independent representation of symbolic sequences is particularly relevant for biological applications where conserved motifs occur in multiple, overlapping scales, with significant future applications in the recognition of foreign genomic material and inference of motif structures.
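
The Chaos Game Representation mentioned in this abstract can be sketched in a few lines. This is only the standard iterated map, not the authors' Matlab code or their entropic profile computation: each base pulls the current point halfway toward its corner of the unit square. The corner assignment and starting point below are conventions assumed for this sketch.

```python
# Corner assignment is a convention chosen for this sketch.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq, start=(0.5, 0.5)):
    """Return the CGR coordinates visited while reading seq left to right."""
    x, y = start
    points = []
    for base in seq:
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the base's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


if __name__ == "__main__":
    for point in cgr_points("GATTACA"):
        print(point)
```

Two sequences that end in the same k bases map to points within $2^{-k}$ of each other in each coordinate, so a density estimate over these points reflects how often a suffix (motif) occurs at each scale.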